AITopics | short document

short document

OTLDA: A Geometry-aware Optimal Transport Approach for Topic Modeling

Neural Information Processing Systems | Dec-24-2025, 17:13:00 GMT

We present an optimal transport framework for learning topics from textual data. While the celebrated Latent Dirichlet Allocation (LDA) topic model and its variants have been applied in many disciplines, they mainly focus on word occurrences and neglect semantic regularities in language. Recent works have tried to exploit the semantic relationships between words to bridge this gap, but these models, which are usually extensions of LDA or the Dirichlet Multinomial Mixture (DMM), are tailored to deal effectively with either regular or short documents. The optimal transport distance provides an appealing tool for incorporating the geometry of word semantics into topic modeling. Moreover, recent developments in the efficient computation of optimal transport distances also promote its application in topic modeling. In this paper, we build on optimal transport theory to naturally exploit the geometric structure of semantically related words in embedding spaces, which leads to more interpretable learned topics. Comprehensive experiments illustrate that the proposed framework outperforms competitive approaches in terms of topic coherence on assorted text corpora that include both long and short documents. The learned topic representation also leads to better accuracy on downstream classification tasks, which serves as an extrinsic evaluation.
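As a rough illustration of the "efficient computation of optimal transport distance" mentioned in the abstract, the sketch below implements entropy-regularized optimal transport with Sinkhorn iterations, one widely used approach. The abstract does not say which solver OTLDA uses, so this is an assumption, as are the function name `sinkhorn_distance`, the parameters `reg` and `n_iters`, and the idea that `cost[i][j]` holds a distance between the embeddings of word i and word j.

```python
import math


def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT cost between histograms a and b.

    Illustrative sketch only: a and b are word distributions that each sum
    to 1, and cost[i][j] is an assumed ground distance between word i of a
    and word j of b (for example, a distance between word embeddings).
    """
    n, m = len(a), len(b)
    # Gibbs kernel: small cost means large kernel value.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so that the transport plan's
        # row sums approach a and its column sums approach b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan


# Toy usage: all mass sits on word 0 in one document and on word 1 in the
# other, so the whole unit of mass must travel the distance cost[0][1].
cost = [[0.0, 1.0], [1.0, 0.0]]
distance, plan = sinkhorn_distance([1.0, 0.0], [0.0, 1.0], cost)
```

Smaller values of `reg` bring the result closer to the unregularized transport cost but make the kernel entries numerically smaller, so a practical implementation would likely work in log space.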

geometry-aware optimal transport approach, name change, topic modeling, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.60)

bda5c35eded86adaf0231748e3ce071c-Paper-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 16:31:36 GMT

data mining, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Data Science > Data Mining (0.68)
(2 more...)

NExtLong: Toward Effective Long-Context Training without Long Documents

Gao, Chaochen, Wu, Xing, Lin, Zijia, Zhang, Debing, Hu, Songlin

arXiv.org Artificial Intelligence | Jan-22-2025

Large language models (LLMs) with extended context windows have made significant strides, yet training them remains a challenge due to the scarcity of long documents. Existing methods tend to synthesize long-context data but lack a clear mechanism to reinforce long-range dependency modeling. To address this limitation, we propose NExtLong, a novel framework for synthesizing long-context data through Negative document Extension. NExtLong decomposes a document into multiple meta-chunks and extends the context by interleaving hard negative distractors retrieved from pretraining corpora. This approach compels the model to discriminate long-range dependent context from distracting content, enhancing its ability to model long-range dependencies. Extensive experiments demonstrate that NExtLong achieves significant performance improvements on the HELMET and RULER benchmarks compared to existing long-context synthesis approaches and to leading models that are trained on non-synthetic long documents. These findings highlight NExtLong's ability to reduce reliance on non-synthetic long documents, making it an effective framework for developing advanced long-context LLMs.
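The interleaving step described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the function names, the word-count chunking, the Jaccard word-overlap retriever, and the choice to place distractors directly after each meta-chunk are all assumptions made for illustration, since the abstract does not specify them.

```python
def split_into_chunks(text, chunk_size):
    """Split a document into meta-chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def jaccard(a, b):
    """Toy similarity (assumed stand-in for a real retriever): word overlap."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def extend_with_negatives(document, corpus, chunk_size=8, k=1):
    """Interleave each meta-chunk with its k most similar corpus passages.

    Passages that resemble a chunk but come from other documents act as
    hard negative distractors between related chunks.
    """
    pieces = []
    for chunk in split_into_chunks(document, chunk_size):
        pieces.append(chunk)
        ranked = sorted(corpus, key=lambda d: jaccard(chunk, d), reverse=True)
        pieces.extend(ranked[:k])
    return pieces


# Toy usage with a hypothetical three-passage corpus.
corpus = ["cats chase birds", "stock markets fell", "dogs chase cats often"]
extended = extend_with_negatives("cats chase mice dogs chase cats", corpus,
                                 chunk_size=3, k=1)
```

In this toy run the two original chunks end up separated by a similar-looking passage, so a model trained on the result has to relate the chunks across distracting content.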

large language model, machine learning, preprint arxiv, (20 more...)

arXiv.org Artificial Intelligence

2501.12766

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

OTLDA: A Geometry-aware Optimal Transport Approach for Topic Modeling

Neural Information Processing Systems | Oct-11-2024, 12:17:27 GMT

geometry-aware optimal transport approach, short document, topic modeling, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.63)

Attention-Seeker: Dynamic Self-Attention Scoring for Unsupervised Keyphrase Extraction

Z., Erwin D. López, Tang, Cheng, Shimada, Atsushi

arXiv.org Artificial Intelligence | Sep-17-2024

This paper proposes Attention-Seeker, an unsupervised keyphrase extraction method that leverages self-attention maps from a Large Language Model to estimate the importance of candidate phrases. Our approach identifies specific components - such as layers, heads, and attention vectors - where the model pays significant attention to the key topics of the text. The attention weights provided by these components are then used to score the candidate phrases. Unlike previous models that require manual tuning of parameters (e.g., selection of heads, prompts, hyperparameters), Attention-Seeker dynamically adapts to the input text without any manual adjustments, enhancing its practical applicability. We evaluate Attention-Seeker on four publicly available datasets: Inspec, SemEval2010, SemEval2017, and Krapivin. Our results demonstrate that, even without parameter tuning, Attention-Seeker outperforms most baseline models, achieving state-of-the-art performance on three out of four datasets, particularly excelling in extracting keyphrases from long documents.
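To make the scoring idea concrete, here is a minimal sketch of ranking candidate phrases by attention weights. It assumes a single per-token attention vector has already been obtained from the selected model components, and it scores a phrase by summing the weights of its tokens over every occurrence. The abstract does not state the aggregation rule, so the summation, the function name `score_candidates`, and the toy weights are assumptions.

```python
def score_candidates(tokens, attention, candidates):
    """Rank candidate phrases by the attention mass their tokens receive.

    tokens: list of words in the document.
    attention: one weight per token (assumed to come from the selected
        layers and heads; the values used below are made up).
    candidates: list of candidate phrases, each a space-separated string.
    """
    scores = {}
    for phrase in candidates:
        words = phrase.split()
        n = len(words)
        total = 0.0
        # Add the attention of every occurrence of the phrase in the text.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                total += sum(attention[i:i + n])
        scores[phrase] = total
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy usage with hypothetical attention weights.
tokens = ["neural", "topic", "models", "learn", "topics"]
attention = [0.10, 0.40, 0.30, 0.05, 0.15]
ranked = score_candidates(tokens, attention, ["topic models", "neural"])
```

Summation favors longer and more frequent phrases; averaging per token would be an equally plausible choice under the same assumptions.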

attention-seeker, keyphrase extraction, vector, (12 more...)

arXiv.org Artificial Intelligence

2409.10907

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Dominican Republic (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Experiments on Generalizability of BERTopic on Multi-Domain Short Text

de Groot, Muriël, Aliannejadi, Mohammad, Haas, Marcel R.

arXiv.org Artificial Intelligence | Dec-16-2022

Topic modeling is widely used for analytically evaluating large collections of textual data. One of the most popular topic modeling techniques is Latent Dirichlet Allocation (LDA), which is flexible and adaptive but not optimal for, e.g., short texts from various domains. We explore how the state-of-the-art BERTopic algorithm performs on short multi-domain text and find that it generalizes better than LDA in terms of topic coherence and diversity. We further analyze the performance of the HDBSCAN clustering algorithm utilized by BERTopic and find that it classifies a majority of the documents as outliers. This crucial yet overlooked problem excludes too many documents from further analysis. When we replace HDBSCAN with k-Means, we achieve similar performance, but without outliers.
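The reason k-Means produces no outliers is structural: every point is assigned to its nearest centroid, whereas density-based methods such as HDBSCAN can leave points unassigned. The sketch below is a minimal k-Means written for illustration only. It is not BERTopic's code, the 2-D points stand in for document embeddings, and initializing the centroids from the first k points is an assumption chosen to keep the example deterministic.

```python
def kmeans(points, k, n_iters=20):
    """Minimal k-Means: every point always receives a cluster label."""
    # Deterministic initialization from the first k points (an assumption).
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Toy usage: (5, 5) lies between the two groups, yet it still gets a label
# instead of being set aside as an outlier.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (5, 5)]
labels, centroids = kmeans(points, k=2)
```

The trade-off is that a forced assignment can place a genuinely unrelated document into a topic, which is the behavior the abstract accepts in exchange for keeping all documents in the analysis.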

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2212.08459

Country:

Europe > Netherlands > North Holland > Amsterdam (0.07)
Asia > Middle East > Jordan (0.05)

Genre: Research Report (0.64)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

Hierarchical Heterogeneous Graph Representation Learning for Short Text Classification

Wang, Yaqing, Wang, Song, Yao, Quanming, Dou, Dejing

arXiv.org Artificial Intelligence | Oct-30-2021

Short text classification is a fundamental task in natural language processing. In practice, it is hard because of the lack of context information and labeled data. In this paper, we propose a new method called SHINE, which is based on graph neural networks (GNNs), for short text classification. First, we model the short text dataset as a hierarchical heterogeneous graph consisting of word-level component graphs, which introduce more semantic and syntactic information. Then, we dynamically learn a short document graph that facilitates effective label propagation among similar short texts. Thus, compared with existing GNN-based methods, SHINE can better exploit interactions between nodes of the same type and capture similarities between short texts. Extensive experiments on various benchmark short text datasets show that SHINE consistently outperforms state-of-the-art methods, especially with fewer labels.

classification, graph, text classification, (15 more...)

arXiv.org Artificial Intelligence

2111.0018

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > China > Hong Kong (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dynamic User Profiling for Streams of Short Texts

Liang, Shangsong (University College London)

AAAI Conferences | Feb-8-2018

In this paper, we tackle the problem of dynamic user profiling in the context of streams of short texts. Profiling users' expertise in such a context is more challenging than in the case of long documents in a static collection, as it is difficult to track users' dynamic expertise in sparse streaming data. To obtain better profiling performance, we propose a streaming profiling algorithm (SPA). SPA first utilizes the proposed user expertise tracking topic model (UET) to track the changes in users' dynamic expertise and then utilizes the proposed streaming keyword diversification algorithm (SKDA) to produce top-k diversified keywords for profiling users' dynamic expertise at a specific point in time. Experimental results validate the effectiveness of the proposed algorithms.

algorithm, expertise, topic model, (16 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.94)

Industry:

Information Technology (0.46)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)